A soft-computation hybrid method for search of the antibiotic-resistant gene in Mycobacterium tuberculosis for promising drug target identification and antimycobacterial lead discovery

Summary
Tuberculosis (TB) control programs were already being piloted before the COVID-19 pandemic began, and the pandemic amplified the global TB response. Combating the global TB epidemic requires drug repurposing, novel drug discovery, identification and targeting of antimicrobial resistance (AMR) genes, and attention to the social determinants of TB. This study aimed to identify AMR genes in Mycobacterium tuberculosis (MTB) and a new antimycobacterial drug candidate. We used Maestro v12.8 together with the STRING v11.0, KEGG and PASS server resources to explore MTB AMR genes as drug targets and to identify potent antimycobacterial agents. Computer-aided analysis identified the AMR genes mtrA and katG as potential drug targets, against which antimycobacterial drug candidates were screened. Based on docking scores of –4.218 and –6.161 against mtrA and katG, respectively, carvacrol was identified as a potent inhibitor of both drug targets. This research combines drug target identification with the discovery of antimycobacterial leads, a promising approach to the challenge of antibiotic resistance in Mycobacterium, and contributes to the development of a potential future solution.


Introduction
Mycobacterium tuberculosis (MTB) is the bacterium that causes tuberculosis (TB), which is considered the world's most lethal infectious disease, particularly in developing countries; affected regions and countries include South-East Asia, India, Africa, the Western Pacific, Indonesia, China, the Eastern Mediterranean, the Philippines, Pakistan, Nigeria, Bangladesh, South Africa, the Americas and Europe (Chakaya et al., 2021). The bacterium spreads rapidly in crowded areas, which exacerbates transmission (Pereira et al., 2005). TB primarily affects the respiratory system, and symptoms include persistent coughing, chest discomfort, coughing up blood, weakness, weight loss, fever and night sweats (Zaman, 2010). The World Health Organization (WHO) has published an annual report on TB prevalence since 1996, which covers over 99% of the global population across 198 countries and territories (Chakaya et al., 2021). In 2019, approximately 10.0 million people worldwide were suffering from TB, leading to 1.2 million deaths among HIV-negative individuals and 208 000 deaths among HIV-positive individuals (Zaman, 2010). According to recent estimates, approximately 1.7 billion people worldwide are infected with MTB. Based on estimates of latent infection, this is projected to result in 16.3 and 8.3 active TB cases per 100 000 people in 2030 and 2050, respectively (Chakaya et al., 2021). Globally, over half a million people have been affected by rifampicin-resistant TB, of whom 78% had multidrug-resistant (MDR) TB, according to the WHO report (https://www.who.int/teams/global-tuberculosis-programme/tbreports/global-tuberculosis-report2022/tb-disease-burden/2-3drug-resistant-tb) (Chakaya et al., 2022).
MDR strains are resistant to anti-tuberculosis drugs such as isoniazid (Inh), rifampicin (Rif), ethambutol (E) and pyrazinamide, whereas extensively drug-resistant (XDR) strains are also resistant to injectable second-line anti-tuberculosis drugs such as amikacin, kanamycin and capreomycin (Casadevall, 2017; Pontali et al., 2019). A study by Ballell et al. (2005) found that MTB prevalence and deaths are increasing exponentially despite the availability of efficient first-line drugs.
TB remains a significant global health problem, with a high burden in many parts of the world. According to the latest data, the highest prevalence of TB is in the South-East Asia region, which accounts for 44% of the total cases worldwide. India also has a high burden, with 26% of the cases, while Africa reports 25% of the global TB burden. The Western Pacific region follows, contributing 18% of the cases, with Indonesia at 8.5% and China at 8.4%. The Eastern Mediterranean region reports 8.2% of the cases, with the Philippines at 6%, Pakistan at 5% and Nigeria at 4.4%. Bangladesh and South Africa each account for 3.6%. The American region reports 2.9% of the cases, while Europe has the lowest burden at 2.5% (Chakaya et al., 2021). TB therapy today therefore requires the use of several bactericidal and sterilizing drugs over long periods to eradicate active MTB while simultaneously inhibiting the development of antibiotic resistance in existing bacteria (Ma et al., 2020; Nathanson et al., 1977). Consequently, new therapeutic agents are needed that are effective against MDR and XDR bacteria without causing adverse effects during long-term use.
Throughout the world, plants are used to treat a wide range of diseases because they contain a large number of bioactive metabolites. Apart from their anti-tuberculosis capabilities, plant extracts have been shown to have immunomodulatory potential in a variety of cell lines and animal models. It is estimated that approximately 75% of recognized anti-infective drugs are derived from medicinal plants because they are comparatively safe and less toxic (Cragg et al., 1997). More than 350 plant species have been investigated for their potential applications in treating tuberculosis in traditional medicine (Eldeen and van Staden, 2007). Phenolic compounds are one of the major classes that can be used in treating many diseases. Plant-derived phenolics can inhibit the growth and activity of many microbes, including many clinically important bacteria, fungi, protozoa and viruses. We therefore selected phenolic compounds to assess their antimicrobial efficacy against MTB in this study. According to previous research, phenolic compounds are promising inhibitors and drug candidates for TB, and over the past 17 years they have been found to be potent candidates in the treatment of MTB (Kumar et al., 2022; Mazlun et al., 2019; Tyring, 2012). A variety of plants, such as Phyllanthus emblica, Bauhinia racemosa and others, produce such secondary metabolites (Prabhu et al., 2017, 2021). In addition, anticancer drugs such as vinblastine and Taxol are derived from the medicinal plants Catharanthus roseus and Taxus brevifolia, respectively (Veeresham, 2012). An estimated 25% of all FDA-approved drugs over the past 20 years have been based on natural products or their derivatives, according to a review published in 2018 (Patridge et al., 2016; Thomford et al., 2018).
Furthermore, 65% of the world's population uses herbal formulations, including those from Carica papaya, Andrographis paniculata and Calotropis procera, to treat stomach pain, poisonous bites, dengue, COVID-19 and respiratory disorders (Farnsworth et al., 1985).
Natural products have been successfully integrated into the healthcare system because of their pharmacological potential; it is estimated that 95% of public hospitals in China have traditional medicine departments. Among the different classes of natural molecules, the pharmacological significance of natural phenolic compounds against other microbial diseases is well established (ref. 18); however, their antimycobacterial effect remains poorly characterized. We therefore used computational analysis to investigate the antimycobacterial potential of phenolic compounds, with a view to developing effective antimycobacterial pharmaceutical candidates in the future. We propose a novel hybrid approach that combines data mining and molecular docking to identify MTB AMR genes as drug targets and phenolic compounds as potential inhibitors of TB.

Identification of AMR genes
To identify the AMR genes present in M. tuberculosis, the Comprehensive Antibiotic Resistance Database (CARD) (Chen et al., 2020) was used as a resistance gene identifier. CARD (https://card.mcmaster.ca/ontology/39867) provides quick information about AMR genes and their potential, and correlates them with the actual resistance traits of bacteria (Camiade et al., 2020; Hendriksen et al., 2019; Thomas et al., 2017). Gene family, drug class, resistance mechanism and protein homology were analyzed, and the AMR genes of MTB were shortlisted.

Search for MTB AMR genes non-homologous to the human genome
The FASTA sequences of the shortlisted AMR genes were downloaded from UniProt by searching for each gene name at https://www.uniprot.org/. The sequences were then analyzed for homology to the human genome using the Basic Local Alignment Search Tool (BLAST).
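The homology screen described above can be sketched as a filter over tabular BLAST results. This is a minimal illustration, assuming the standard 12-column tabular layout (as produced by BLAST+ with `-outfmt 6`); the E-value and identity cut-offs, the function names and the example rows are assumptions made for illustration and are not taken from this study.

```python
from typing import Iterable, List, Set

# Assumed cut-offs for calling a human hit "homologous" (not from the study).
MAX_EVALUE = 1e-5
MIN_IDENTITY = 35.0


def homologous_queries(blast_rows: Iterable[str],
                       max_evalue: float = MAX_EVALUE,
                       min_identity: float = MIN_IDENTITY) -> Set[str]:
    """Return query IDs that have at least one hit passing both cut-offs.

    Rows follow the 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    homologs: Set[str] = set()
    for row in blast_rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank and comment lines
        fields = row.split("\t")
        query, identity, evalue = fields[0], float(fields[2]), float(fields[10])
        if evalue <= max_evalue and identity >= min_identity:
            homologs.add(query)
    return homologs


def non_homologous(queries: Iterable[str], blast_rows: Iterable[str]) -> List[str]:
    """Keep the queries with no qualifying human hit, preserving input order."""
    homologs = homologous_queries(list(blast_rows))
    return [q for q in queries if q not in homologs]


# Made-up rows for illustration only; they are not the study's BLAST output.
EXAMPLE_ROWS = [
    "geneA\thumanP1\t52.0\t300\t140\t4\t1\t300\t5\t304\t1e-40\t250",
    "geneB\thumanP2\t28.0\t60\t43\t0\t10\t69\t20\t79\t2.1\t25",
]
print(non_homologous(["geneA", "geneB", "geneC"], EXAMPLE_ROWS))
```

A query with no rows at all (here `geneC`) is treated as non-homologous, which matches the intent of retaining only targets without a human counterpart.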

Target determination and subcellular localization of selected MTB AMR genes
AMR has played a noteworthy role in the number of human deaths due to TB in particular (Castro et al., 2021); hence, AMR genes are promising targets in drug development. To be considered a drug target, a protein must be important to the pathogen's survival within the host's body and must not be homologous to host proteins. The non-homologous AMR genes were selected to prevent cross-binding of the drug to host proteins, which would increase the probability of adverse effects (Sakharkar et al., 2004). Localization and protein family are the major parameters for drug target analysis. The AMR gene sequences were obtained from https://card.mcmaster.ca/ontology/39867. The subcellular localization predictor CELLO v.2.5 was used to determine the subcellular location of each gene product; the query protein sequence (FASTA format) was submitted to http://cello.life.nctu.edu.tw/. Based on the analysis report, the localization and reliability parameters were recorded, and the functional protein family was identified using protein family analysis (PFA), for which the query protein FASTA sequence was entered in the query box at https://pfam.xfam.org/search/sequence.

Functional process and pathway analysis of MTB AMR genes
Metabolic pathway information was retrieved from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Kanehisa et al., 2022). The assigned ID of M. tuberculosis and the gene names were retrieved via https://www.genome.jp/kegg/pathway.html. KEGG is a broad-spectrum source of metabolic pathway information that helps to identify unique proteins and to explore metabolic pathways and their respective protein sequences.

Protein-protein interaction
Protein-protein interactions (PPI) were predicted using STRING, an online server tool that integrates both known and predicted protein-protein interactions (Szklarczyk et al., 2021). To look for potential interactions between our AMR genes, the gene names and organism name were entered on the query page at https://STRING-db.org/cgi/input?sessionId=biiK4a6BFLeg&input_page_show_search=on. Interactions with scores > 0.4 were treated as active, and Cytoscape version 3.6.1 was used to visualize the PPI networks.
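The score cut-off can be illustrated with a short sketch that keeps only interactions scoring above 0.4 and counts how many partners each protein retains. The edge representation, the function names and the example scores are hypothetical; they do not reproduce the STRING output of this study.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (protein A, protein B, interaction score on a 0-1 scale)
Edge = Tuple[str, str, float]


def confident_edges(edges: List[Edge], min_score: float = 0.4) -> List[Edge]:
    """Keep interactions whose score is strictly greater than the cut-off."""
    return [edge for edge in edges if edge[2] > min_score]


def node_degree(edges: List[Edge]) -> Dict[str, int]:
    """Count the number of retained interaction partners per protein."""
    degree: Counter = Counter()
    for a, b, _score in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Placeholder proteins and scores, for illustration only.
EXAMPLE_EDGES: List[Edge] = [
    ("P1", "P2", 0.62),
    ("P2", "P3", 0.45),
    ("P1", "P4", 0.31),  # below the cut-off, so it is dropped
]

kept = confident_edges(EXAMPLE_EDGES)
print(kept)
print(node_degree(kept))
```

Proteins that lose all of their edges after filtering (here `P4`) simply do not appear in the degree table.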

Antigenicity and allergenicity evaluation of AMR target proteins
VaxiJen v2.0 was used to predict the antigenicity and allergenicity of the AMR proteins. The FASTA sequences of the individual genes were used as query sequences, and sequences with a score over 0.4 were considered antigenic. Allergenicity and antigenicity play important roles in disease diagnosis; a target protein may be an antigen but should not be an allergen (Zhang and Tao, 2015).
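The two-part criterion (antigenic but not allergenic) can be written as a small filter. This is only a sketch: the `ProteinCall` record, the function name and the example values are hypothetical, and the 0.4 threshold is the one stated above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProteinCall:
    """Prediction summary for one protein (hypothetical record layout)."""
    name: str
    antigen_score: float
    is_allergen: bool


def candidate_targets(calls: List[ProteinCall], threshold: float = 0.4) -> List[str]:
    """Return proteins that score above the threshold and are not allergens."""
    return [
        call.name
        for call in calls
        if call.antigen_score > threshold and not call.is_allergen
    ]


# Placeholder values for illustration; not the scores reported in this study.
EXAMPLE_CALLS = [
    ProteinCall("protein_1", 0.55, False),  # antigenic, non-allergen: kept
    ProteinCall("protein_2", 0.61, True),   # antigenic but allergen: dropped
    ProteinCall("protein_3", 0.32, False),  # below threshold: dropped
]
print(candidate_targets(EXAMPLE_CALLS))
```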

Collection and categorization of phenolic compounds from a natural source and their antimycobacterial activity analysis
Based on a literature survey, phenolic compounds were identified and classified using PubChem. Their antimycobacterial activity was assessed using Prediction of Activity Spectra for Substances (PASS) Online (http://www.way2drug.com/passonline/predict.php), a web-based tool that uses computational models and algorithms to analyze chemical structures and predict their biological activity profiles. Compounds with an antimycobacterial activity score of more than 0.5 were selected for docking analysis.
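The shortlisting step amounts to thresholding and ranking the predicted activity scores. The sketch below assumes the scores have already been collected into a dictionary by hand; the function name and the example compounds and values are placeholders, not PASS output from this study.

```python
from typing import Dict, List


def select_for_docking(activity_scores: Dict[str, float],
                       threshold: float = 0.5) -> List[str]:
    """Return compounds scoring above the threshold, best score first."""
    passed = {name: score for name, score in activity_scores.items()
              if score > threshold}
    return sorted(passed, key=passed.get, reverse=True)


# Placeholder compounds and scores, for illustration only.
EXAMPLE_SCORES = {
    "compound_1": 0.58,
    "compound_2": 0.47,
    "compound_3": 0.71,
}
print(select_for_docking(EXAMPLE_SCORES))
```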

Natural phenolics-target protein interaction analysis
Schrodinger Maestro 12.8 was used for the docking analysis of phenolic compound-target protein interactions. The phenolic compounds were shortlisted based on their antimycobacterial activity, and five compounds were selected for docking analysis.

Ligand preparation
To obtain optimized structures, the five selected phenolic compounds were drawn in the 2D sketcher workspace of Schrodinger Maestro 12.8 and prepared as ligands.

Macromolecule/protein selection and preparation
The Protein Data Bank (PDB) is the global archive of 3D structures of biological macromolecules and provides efficient and immediate access to information on the majority of proteins (Burley et al., 2017). The crystal structures of our selected target proteins were retrieved from PDB (https://www.rcsb.org/) in PDB format. Only two AMR proteins (mtrA and katG) have PDB structures, so the selected compounds were docked with these two proteins to determine their docking potential and hydrogen bond interactions. The protein preparation wizard in Schrodinger Maestro 12.8 was used for protein preparation. All water molecules and ligands present in the protein structures were removed, and the active binding site was generated.
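The removal of waters and bound ligands can be illustrated outside Maestro with a plain-text filter over a PDB-format file. This is a simplified stand-in for the preparation wizard, not a description of what it does internally: it assumes that every `HETATM` record is a water or ligand to discard (which would also remove cofactors and modified residues), and the function name and sample records are hypothetical.

```python
def strip_hetero(pdb_text: str) -> str:
    """Drop waters and hetero groups from PDB-format text.

    Assumption: all HETATM records are unwanted. CONECT records are dropped
    as well because they mostly describe bonds of the removed hetero atoms.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record in ("HETATM", "CONECT"):
            continue
        # Some files store waters as ATOM records; residue name is columns 18-20.
        if record == "ATOM" and line[17:20] == "HOH":
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Truncated, made-up records for illustration (coordinates omitted).
SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1",
    "ATOM      2  O   HOH A 202",
    "HETATM  100  O   HOH A 201",
    "HETATM  101 FE   HEM A 301",
    "CONECT  101  100",
    "END",
]) + "\n"

print(strip_hetero(SAMPLE), end="")
```

Whether a cofactor should be stripped or retained is a modelling decision that this sketch does not make; it only mirrors the statement that all waters and ligands were removed.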

Site mapping and grid generation
Molecular docking requires identifying a ligand-binding region on the protein. The active sites of the target proteins were generated after protein preparation using SiteMap in Maestro 12.8, and grids were generated at these sites for ligand docking.

Molecular docking
For in-depth analysis of the intermolecular interactions and binding modes between the MTB target proteins and the plant phenolic compounds, molecular docking of the selected ligands was performed using Maestro 12.8. The interactions between the active sites of the target proteins and the ligands were characterized by interaction type and bond distance.

ADMET profiling
ADMET profiling is one of the most important tools in early drug discovery for the identification of active lead compounds (Tsaioun et al., 2009). Absorption, distribution, metabolism, excretion and toxicity (ADMET) are the pharmacokinetic properties of a drug that are assessed in the body. The ADMETlab 2.0 and admetSAR tools were used to evaluate these properties for the top phytochemicals (Cheng et al., 2012; Xiong et al., 2021).

Results
The AMR genes that were non-homologous to the human genome were selected, and their subcellular localization, functional family and associated pathways were identified. Target genes were then selected by protein-protein interaction analysis and antigenicity evaluation. Next, the phenolic compounds were evaluated for antimycobacterial activity and shortlisted based on their activity score (more than 0.5). Finally, the intermolecular interactions between the phenolic compounds and the target proteins from the AMR list were examined by molecular docking analysis.

Identification and selection of AMR genes
In this study, the CARD analysis tool was used to analyze the AMR genes of MTB. Ten strict-hit genes were observed in the MTB genome, as shown in Table 1. These genes contribute to AMR via efflux mechanisms and via antibiotic target protection, alteration and inactivation. The efflux mechanisms are classified into the major facilitator superfamily (MFS) and resistance-nodulation-cell division (RND) families. Antibiotic target protection, alteration and inactivation were classified by antibiotic resistance class (quinolone, isoniazid, rifamycin) and by gene mutation (23S rRNA with mutation conferring resistance to macrolide antibiotics, Erm 23S ribosomal RNA methyltransferase, murA transferase). These findings suggest that AMR is achieved through detoxification, which involves the export of a variety of toxic compounds out of the cell, and through adaptation to the host cell environment and temperature. Because AMR genes present in MTB may be homologous to sequences in the human genome, we performed BLAST of the MTB AMR genes against the human genome to identify sequence homology. All AMR genes were non-homologous except efpA and rpoB, as shown in Table 2. The AMR genes that were non-homologous to the human genome were selected for further analysis.

Navigating the subcellular localization, protein function and pathways analysis
The localization of a target protein is important because proteins can be found in many different locations in the cell. Protein function was also analyzed. According to the results, mfpA and mtrA confer resistance to clarithromycin, intrinsic M. tuberculosis murA confers resistance to fosfomycin, and M. tuberculosis katG mutations confer resistance to isoniazid; all of these were identified as cytoplasmic proteins (Table 2). Cytoplasmic proteins have multiple functional roles in drug resistance mechanisms (Table 2) and are considered favorable drug targets. The molecular pathways of these AMR genes were identified; they are involved in drug metabolism, transport and biosynthesis pathways, as shown in Table 3.

Network construction and target protein analysis
The PPI network of MTB AMR genes was generated and analyzed using the STRING database, as depicted in Figure 1. Four AMR proteins [mtrA, katG, erm(37) and murA] showed significant interactions with each other (P-value < 1.0e-16). Of these, mtrA and katG were selected for docking analysis because their crystal structures are available in PDB. To gain further insight, we assessed the antigenicity and allergenicity of the proteins (Table 3). All proteins except RbpA exhibited antigenic properties, and none except murA displayed allergenic characteristics. Based on antigenicity, allergenicity, pathway analysis and PPI analysis, all proteins were found to be good drug targets (Tables 2 and 3).

Identification of antimycobacterial phenolic compounds
To estimate the antimycobacterial potential of the selected phenolic compounds, we used the PASS server. In this analysis, many compounds showed predicted antimycobacterial activity against MTB. Table 4 shows the predicted probability of each compound being an antimicrobial and antimycobacterial drug candidate. An activity score threshold of >0.5 was used to filter the candidates; PASS predicted that carvacrol, limonene, p-coumaric acid prenyl ester, 4-aminocinnamic acid and 4-nitrocinnamic acid were good inhibitors of MTB, each with a score of >0.5. These five compounds were selected for docking analysis to confirm their antimycobacterial properties.

Molecular docking analysis
The selected phenolic compounds were evaluated by molecular docking for their effectiveness against MTB AMR proteins. The compounds were docked with the AMR proteins mtrA and katG to determine their docking potential and hydrogen bond interactions. The phenolic compounds demonstrated favorable binding affinity and acceptable H-bond values with the amino acids of the AMR proteins. Based on the docking metrics in Table 5, carvacrol, limonene, p-coumaric acid prenyl ester, 4-aminocinnamic acid and 4-nitrocinnamic acid were identified as inhibitors of MTB AMR proteins. Ethambutol was used as a control in this study.

Active site and grid detection of mtrA and katG
Four active sites were detected in katG and two in mtrA. For each protein, the active site with the highest site score was selected as the target site for the docking process: a site score of 1.091 for katG and 0.976 for mtrA.
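Selecting the site with the highest site score is a simple maximum over the detected sites. In the sketch below, only the top scores (1.091 for katG and 0.976 for mtrA) come from the text; the site labels and all lower scores are placeholders added so that the example has something to choose between.

```python
from typing import Dict, Tuple


def best_site(site_scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (site label, score) pair with the highest site score."""
    label = max(site_scores, key=site_scores.get)
    return label, site_scores[label]


# Top scores are from the text; labels and remaining scores are placeholders.
SITES = {
    "katG": {"site_1": 1.091, "site_2": 0.95, "site_3": 0.90, "site_4": 0.80},
    "mtrA": {"site_1": 0.976, "site_2": 0.85},
}

for protein, scores in SITES.items():
    print(protein, best_site(scores))
```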

Phenolic compounds against mtrA
Carvacrol, limonene, p-coumaric acid prenyl ester, 4-aminocinnamic acid and 4-nitrocinnamic acid were docked against the mtrA protein. Carvacrol had the best docking score (-4.218), compared with -3.410 for the control ethambutol, with one H-bond interaction. In this docking complex, the ligand binds to mtrA residues GLU126 and ARG167 (Fig. 2). GLU126 forms a backbone hydrogen bond with carvacrol, and ARG167 shows a π-cation interaction (Table 5, Fig. 2). These bonds are relatively strong and help to stabilize the complex formed between ligand and protein. Each additional hydrogen bond makes the interaction stronger. p-Coumaric acid prenyl ester, 4-aminocinnamic acid and 4-nitrocinnamic acid showed docking scores of -3.105, -3.053 and -3.160, respectively. Based on docking score and amino acid interactions, carvacrol was identified as a potent inhibitor of the mtrA protein of MTB.

Phenolic compounds against katG
According to Table 5 and Figure 3, carvacrol had the best docking score (-6.161), with two H-bond interactions; this score is considerably better (more negative) than that of the standard compound ethambutol (-1.739). The katG residues ARG595 and VAL507 showed affinity toward carvacrol by forming two hydrogen bonds, as shown in Figure 3A and B. This indicates that carvacrol could be a good antimycobacterial candidate. The other phenolic compounds also had good docking scores relative to ethambutol (-1.739) (Fig. 3K and L). Limonene had a docking score of -5.075 and showed no H-bond interaction (Fig. 3C and D); p-coumaric acid prenyl ester (-3.587) formed one H-bond (R595) (Fig. 3E and F); 4-aminocinnamic acid (-2.167) formed one H-bond (R595) (Fig. 3G and H); and 4-nitrocinnamic acid (-3.945) formed one H-bond, an ionic bond and a covalent bond with residues D513, D511 and R595 (Fig. 3I and J). In this study, the selected phenolic compounds showed good interactions with the AMR proteins.
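The comparison with the ethambutol control can be made explicit by ranking the docking scores quoted in the text, where a more negative score indicates better predicted binding. The sketch below uses only the scores stated above (limonene is omitted for mtrA because its score is not quoted in the text); the function names are hypothetical.

```python
from typing import Dict, List

# Docking scores as quoted in the text (more negative = better).
SCORES: Dict[str, Dict[str, float]] = {
    "mtrA": {
        "carvacrol": -4.218,
        "p-coumaric acid prenyl ester": -3.105,
        "4-aminocinnamic acid": -3.053,
        "4-nitrocinnamic acid": -3.160,
        "ethambutol": -3.410,
    },
    "katG": {
        "carvacrol": -6.161,
        "limonene": -5.075,
        "p-coumaric acid prenyl ester": -3.587,
        "4-aminocinnamic acid": -2.167,
        "4-nitrocinnamic acid": -3.945,
        "ethambutol": -1.739,
    },
}


def rank(scores: Dict[str, float]) -> List[str]:
    """Order ligands from best (most negative) to worst docking score."""
    return sorted(scores, key=scores.get)


def better_than_control(scores: Dict[str, float],
                        control: str = "ethambutol") -> Dict[str, float]:
    """Map each ligand that beats the control to its score margin."""
    reference = scores[control]
    return {
        name: round(reference - score, 3)
        for name, score in scores.items()
        if name != control and score < reference
    }


for target, scores in SCORES.items():
    print(target, rank(scores)[0], better_than_control(scores))
```

On the values quoted above, carvacrol ranks first for both targets. For katG all five phenolic compounds score below the ethambutol control, whereas for mtrA only carvacrol does.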

ADMET analysis
Drug candidates can be assessed on the basis of their pharmacokinetic (ADME) properties and toxicity. The ADME and toxicity analyses of the selected phenolic compounds are shown in Tables 6 and 7. This step typically eliminates many candidate drugs because of poor pharmacokinetic properties. According to the ADME and toxicity profiling (Tables 6 and 7), the selected phenolic compounds did not show any adverse effects associated with absorption, except for 4-aminocinnamic acid and 4-nitrocinnamic acid. Several ADMET properties of the candidate compounds, including P-glycoprotein substrate status, BBB penetration and gastrointestinal absorption, showed positive results across the various models and strongly supported their potential as drug candidates. Based on the virtual screening, we identified carvacrol as a hit anti-TB compound.
The results revealed carvacrol as the leading inhibitor, and it could serve as a therapeutic inhibitor of the MTB targets mtrA and katG.

Conclusions
Only about 10% of MTB infections become symptomatic, with the rest remaining latent. Among infectious diseases, TB has caused the greatest number of deaths, and AMR is a major reason.
To develop new antimicrobial agents, AMR targets must be identified to combat the rapid emergence of antimicrobial resistance among Gram-positive bacteria. The current study identified two AMR genes as targets (mtrA and katG) in MTB (Table 5). We focused on developing new antimycobacterial compounds against both targets because they are involved in metabolic pathways specific to the pathogen. We found five phytochemicals that could be considered for future TB clinical trials. Carvacrol showed a significant docking score against both AMR targets (better than ethambutol). We investigated the interaction of carvacrol with the amino acid residues of the targets and found that it had the highest binding affinity. Additional in silico pharmacological and toxicological profiling was performed on the antimycobacterial drug candidates, as they could be used to treat TB (Tables 6 and 7). The selected natural phenolic molecules possess a wide range of biological potential and pharmacological properties. This is the first study to show that they could combat AMR targets of MTB. According to these results, carvacrol has the potential to be a powerful antimycobacterial drug. Further research is warranted to gain a better understanding of the targets and lead compounds. Research in this field will continue to explore the identified compounds, optimize their properties, investigate combination therapies, utilize computational approaches, and advance toward clinical trials. It is anticipated that these efforts will contribute to the development of more effective and targeted anti-TB agents and to the ongoing fight against antimicrobial drug resistance.